## [1] "Loading the following libraries using lb_myRequiredPackages: data.table"
## [2] "Loading the following libraries using lb_myRequiredPackages: lubridate" 
## [3] "Loading the following libraries using lb_myRequiredPackages: ggplot2"   
## [4] "Loading the following libraries using lb_myRequiredPackages: readr"     
## [5] "Loading the following libraries using lb_myRequiredPackages: plotly"    
## [6] "Loading the following libraries using lb_myRequiredPackages: knitr"

1 Purpose

To extract and visualise tweets and re-tweets of #dockercon for 17 - 21 April, 2017 (DockerCon17).

The code borrows extensively from http://thinktostart.com/twitter-authentification-with-r/

2 Load Data

The data should already have been downloaded using collectData.R. After some processing, this produces a data table with the following variables:

##  [1] "text"             "favorited"        "favoriteCount"   
##  [4] "replyToSN"        "created"          "truncated"       
##  [7] "replyToSID"       "id"               "replyToUID"      
## [10] "statusSource"     "screenName"       "retweetCount"    
## [13] "isRetweet"        "retweeted"        "longitude"       
## [16] "latitude"         "location"         "language"        
## [19] "profileImageURL"  "createdLocal"     "obsDateTimeMins" 
## [22] "obsDateTimeHours" "obsDateTime5m"    "obsDateTime10m"  
## [25] "obsDateTime15m"   "obsDate"          "isRetweetLab"
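
The processing step is not shown in this report, so the following is only a minimal sketch of how the derived time variables listed above might be built with lubridate and data.table. The toy data, the assumption that `created` arrives in UTC, the `America/Chicago` time zone and the label wording are all illustrative assumptions, not the contents of collectData.R.

```r
library(data.table)
library(lubridate)

# Toy stand-in for the downloaded tweets; `created` is assumed to be in UTC.
tweetsDT <- data.table(
  created = as.POSIXct(
    c("2017-04-18 14:03:27", "2017-04-18 14:07:59", "2017-04-18 14:16:01"),
    tz = "UTC"
  ),
  isRetweet = c(FALSE, TRUE, FALSE)
)

# Convert to conference-local time (Austin, TX is in America/Chicago).
tweetsDT[, createdLocal := with_tz(created, tzone = "America/Chicago")]

# Round each timestamp down to the start of its time bin.
tweetsDT[, obsDateTimeMins := floor_date(createdLocal, "minute")]
tweetsDT[, obsDateTimeHours := floor_date(createdLocal, "hour")]
tweetsDT[, obsDateTime5m := floor_date(createdLocal, "5 minutes")]
tweetsDT[, obsDateTime10m := floor_date(createdLocal, "10 minutes")]
tweetsDT[, obsDateTime15m := floor_date(createdLocal, "15 minutes")]
tweetsDT[, obsDate := as_date(createdLocal)]

# Human-readable label for plot legends (label wording is assumed).
tweetsDT[, isRetweetLab := ifelse(isRetweet, "Re-tweet", "Tweet")]
```

Binning once at load time means every later table or plot can simply group by the bin column it needs.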

The table has 7,682 tweets (and 10,614 re-tweets) from 6,133 tweeters between 2017-04-16 19:01:03 and 2017-04-20 12:24:58 (Central Daylight Time).
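
Summary counts of this kind can be derived directly from the data table. The sketch below uses a tiny made-up table; only the column names `screenName` and `isRetweet` come from the variable list above, and the object names are hypothetical.

```r
library(data.table)

# Toy data: four (re)tweets from three distinct accounts.
tweetsDT <- data.table(
  screenName = c("a", "b", "a", "c"),
  isRetweet = c(FALSE, TRUE, TRUE, FALSE)
)

# Count original tweets and re-tweets separately.
nTweets <- tweetsDT[isRetweet == FALSE, .N]
nRetweets <- tweetsDT[isRetweet == TRUE, .N]

# Count distinct accounts, regardless of tweet type.
nTweeters <- uniqueN(tweetsDT$screenName)
```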

3 Analysis

3.1 Tweets and Tweeters over time

All (re)tweets containing #dockercon 2017-04-17 to 2017-04-20
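
The plotting code is not included in this report. As a rough sketch, a chart of tweets and tweeters over time could be built by counting rows and distinct screen names per time bin and passing the result to ggplot2. The choice of the 15 minute bin, the stacked columns and the toy data are assumptions made for illustration.

```r
library(data.table)
library(ggplot2)

# Toy data already binned to 15 minutes (local time).
tweetsDT <- data.table(
  obsDateTime15m = as.POSIXct(
    c("2017-04-18 09:00:00", "2017-04-18 09:00:00",
      "2017-04-18 09:00:00", "2017-04-18 09:15:00"),
    tz = "America/Chicago"
  ),
  isRetweetLab = c("Tweet", "Tweet", "Re-tweet", "Tweet"),
  screenName = c("a", "b", "a", "a")
)

# One row per time bin and tweet type.
plotDT <- tweetsDT[
  , .(nTweets = .N, nTweeters = uniqueN(screenName)),
  by = .(obsDateTime15m, isRetweetLab)
]

# Stacked columns: tweets and re-tweets per 15 minutes.
p <- ggplot(plotDT, aes(x = obsDateTime15m, y = nTweets, fill = isRetweetLab)) +
  geom_col() +
  labs(x = "Time (local)", y = "N (re)tweets per 15 minutes", fill = "Type")
```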


3.1.1 Day 1 - Monday (Workshops)

This plot is zoomable - try it!

All (re)tweets containing #dockercon Monday 17th April 2017
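
Since plotly is among the packages loaded, the zoomable daily plots were presumably made by converting a ggplot2 object with `ggplotly()`. That is an assumption; the sketch below only shows the general pattern on invented counts.

```r
library(data.table)
library(ggplot2)
library(plotly)

# Invented 5-minute counts for part of one day.
dayDT <- data.table(
  obsDateTime5m = as.POSIXct(
    c("2017-04-17 09:00:00", "2017-04-17 09:05:00", "2017-04-17 09:05:00"),
    tz = "America/Chicago"
  ),
  isRetweetLab = c("Tweet", "Tweet", "Re-tweet"),
  nTweets = c(3L, 5L, 2L)
)

# Static ggplot2 chart first...
p <- ggplot(dayDT, aes(x = obsDateTime5m, y = nTweets, colour = isRetweetLab)) +
  geom_point() +
  labs(x = "Time (local)", y = "N (re)tweets per 5 minutes", colour = "Type")

# ...then wrap it as an interactive (zoomable) htmlwidget.
pZoom <- ggplotly(p)
```

Printing `pZoom` in an HTML knitr document embeds the interactive widget; in other output formats the static `p` can be printed instead.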

3.1.2 Day 2 - Tuesday (Main Day 1)

This plot is zoomable - try it!

All (re)tweets containing #dockercon Tuesday 18th April 2017

3.1.3 Day 3 - Wednesday (Main Day 2)

This plot is zoomable - try it!

All (re)tweets containing #dockercon Wednesday 19th April 2017

3.1.4 Day 4 - Thursday (Main Day 3)

All (re)tweets containing #dockercon Thursday 20th April 2017


3.2 Location (lat/long)

We wanted to make a nice map but sadly almost all (re)tweets (18,244 of 18,296) have no lat/long set.

All logged lat/long values
latitude       longitude        nTweets
NA             NA               18244
30.26416397    -97.73961067     2
30.26857       -97.73617        1
30.2625        -97.7401         29
30.26470908    -97.7417368      1
30.20226566    -97.66722505     1
42.36488267    -71.02168356     1
37.61697678    -122.38427689    1
30.2672        -97.7639         3
30.2635554     -97.7399303      1
30.2591        -97.7384         1
30.26622515    -97.74327721     1
30.26037       -97.73848        3
30.258201      -97.71264        1
30.25888       -97.73841        2
30.259714      -97.73940054     1
30.26006       -97.73813        1
30.26006       -97.73859        1
30.26036009    -97.73848483     1
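
A table like the one above can be produced by grouping on the two coordinate columns. In this sketch the toy values and the choice to hold coordinates as character strings are assumptions; data.table treats `NA` as a group of its own, which is why the missing coordinates show up as a single row.

```r
library(data.table)

# Toy data: two (re)tweets without coordinates, three with.
tweetsDT <- data.table(
  latitude = c(NA, NA, "30.2625", "30.2625", "42.36488267"),
  longitude = c(NA, NA, "-97.7401", "-97.7401", "-71.02168356")
)

# Count (re)tweets per distinct lat/long pair, NA included.
latLongDT <- tweetsDT[, .(nTweets = .N), by = .(latitude, longitude)]

# Share of (re)tweets that carry coordinates at all.
shareGeotagged <- tweetsDT[, mean(!is.na(latitude) & !is.na(longitude))]
```

Applied to the counts reported above, the same share is 52 of 18,296, or roughly 0.3%.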

3.3 Location (textual)

This appears to be pulled from the user’s profile although it may also be a ‘guesstimate’ of current location.

Top locations for tweets:

Top 15 locations for tweeting
location               nTweets
NA                     2810
San Francisco, CA      1302
San Francisco          511
Austin, TX             337
Seattle, WA            229
Silicon Valley, CA     211
Paris                  183
Islamabad, Pakistan    140
London                 136
New York, NY           124
Charlotte, NC          119
San Jose, CA           118
Boston, MA             107
USA                    105
west tokyo             104

Top locations for tweeters:

Top 15 locations for tweeters
location             nTweeters
NA                   1147
San Francisco, CA    176
Austin, TX           88
San Francisco        61
Seattle, WA          49
Paris                43
New York, NY         42
San Jose, CA         41
London, England      35
Paris, France        34
London               32
Palo Alto, CA        30
France               29
New York             28
Boston, MA           27
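
The two location tables differ only in what is counted: rows for tweets, distinct screen names for tweeters. A minimal sketch on toy data follows; the object names are hypothetical. Because `location` is free text, variants such as "San Francisco, CA" and "San Francisco" are counted as separate places, as the tables above show.

```r
library(data.table)

# Toy data: six (re)tweets from five accounts.
tweetsDT <- data.table(
  location = c("Austin, TX", "Austin, TX", "Austin, TX", "Paris", "Paris", NA),
  screenName = c("a", "a", "b", "c", "d", "e")
)

# Top locations by number of (re)tweets.
topTweetLocs <- head(
  tweetsDT[, .(nTweets = .N), by = location][order(-nTweets)],
  15
)

# Top locations by number of distinct tweeters.
topTweeterLocs <- head(
  tweetsDT[, .(nTweeters = uniqueN(screenName)), by = location][order(-nTweeters)],
  15
)
```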

3.4 Screen name

Next we’ll try by screen name.

Top tweeters:

Top 15 tweeters
screenName        nTweets
DockerCon         330
theCUBE           171
BettyJunod        132
climbingkujira    127
jpetazzo          124
solomonstre       111
jeanepaul         104
ManoMarks         95
kaslinfields      93
OpenShiftNinja    87
sitspak           80
SFoskett          80
vmblog            78
jameskobielus     76
bsmith626         73

And here’s a really bad visualisation of all of them!

N tweets per 5 minutes by screen name


So let’s re-do that for the top 50 tweeters.

N tweets per 5 minutes by screen name (top 50, most prolific tweeters at bottom)
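
One way to restrict such a plot to the most prolific accounts is to rank screen names by tweet count, keep the top N, and fix the factor order so the busiest account sits at the bottom of the y axis. The sketch below is an assumption about the approach rather than the report's own code; it uses toy data, a hypothetical `topN` of 2 in place of 50, and `geom_tile()` as one plausible geometry.

```r
library(data.table)
library(ggplot2)

# Toy data: account "a" tweets most, then "b", then "c".
tweetsDT <- data.table(
  screenName = c("a", "a", "a", "b", "b", "c"),
  obsDateTime5m = as.POSIXct(
    c("2017-04-18 09:00:00", "2017-04-18 09:00:00", "2017-04-18 09:05:00",
      "2017-04-18 09:00:00", "2017-04-18 09:10:00", "2017-04-18 09:05:00"),
    tz = "America/Chicago"
  )
)

# Rank accounts by total (re)tweets and keep the top N names.
topN <- 2 # 50 in the report
tweeterCounts <- tweetsDT[, .(nTweets = .N), by = screenName][order(-nTweets)]
topNames <- head(tweeterCounts$screenName, topN)

# Count (re)tweets per 5 minute bin for the selected accounts only.
plotDT <- tweetsDT[
  screenName %in% topNames,
  .(nTweets = .N),
  by = .(obsDateTime5m, screenName)
]

# The first factor level is drawn lowest on a discrete y axis,
# so the most prolific account ends up at the bottom.
plotDT[, screenName := factor(screenName, levels = topNames)]

p <- ggplot(plotDT, aes(x = obsDateTime5m, y = screenName, fill = nTweets)) +
  geom_tile() +
  labs(x = "Time (local)", y = "Screen name", fill = "N tweets")
```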


4 About

Analysis completed in: 45.28 seconds using knitr in RStudio with R version 3.3.3 (2017-03-06) running on x86_64-apple-darwin13.4.0.
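
A run-time line like the one above can be generated from base R alone. This is a sketch of one way to do it, not necessarily how this report does; the placeholder workload and the variable names are hypothetical.

```r
# Record the start time at the top of the script.
startTime <- proc.time()

# Placeholder for the real analysis.
invisible(sum(sqrt(seq_len(1e5))))

# Elapsed wall-clock time since the start.
elapsed <- proc.time() - startTime

# Build the summary sentence from the session's own version details.
msg <- paste0(
  "Analysis completed in: ", round(elapsed[["elapsed"]], 2),
  " seconds with ", R.version.string,
  " running on ", R.version$platform, "."
)
print(msg)
```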

A special mention must go to twitteR (Gentry, n.d.) for the Twitter API interaction functions and lubridate (Grolemund and Wickham 2011), which allows timezone manipulation without too many tears.

Other R packages used:

  • base R - for the basics (R Core Team 2016)
  • data.table - for fast (big) data handling (Dowle et al. 2015)
  • readr - for nice data loading (Wickham, Hester, and Francois 2016)
  • ggplot2 - for slick graphs (Wickham 2009)
  • plotly - for fancy, zoomable slick graphs (Sievert et al. 2016)
  • knitr - to create this document (Xie 2016)

References

Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. data.table: Extension of data.frame. https://CRAN.R-project.org/package=data.table.

Gentry, Jeff. n.d. twitteR: R Based Twitter Client. http://lists.hexdump.org/listinfo.cgi/twitter-users-hexdump.org.

Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.

R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2016. plotly: Create Interactive Web Graphics via 'plotly.js'. https://CRAN.R-project.org/package=plotly.

Wickham, Hadley. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.

Wickham, Hadley, Jim Hester, and Romain Francois. 2016. readr: Read Tabular Data. https://CRAN.R-project.org/package=readr.

Xie, Yihui. 2016. knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.